CFunModel: A "Funny" Language Model Capable of Chinese Humor Generation and Processing

  • 2025-03-26 11:44:51
  • Zhenghan Yu, Xinyu Hu, Xiaojun Wan

Abstract

Humor plays a significant role in daily language communication. With the rapid development of large language models (LLMs), natural language processing has made significant strides in understanding and generating various genres of texts. However, most LLMs exhibit poor performance in generating and processing Chinese humor. In this study, we introduce a comprehensive Chinese humor-related dataset, the Chinese Fun Set (CFunSet). This dataset aggregates existing Chinese humor datasets and includes over 20,000 jokes collected from Tieba-JokeBar, a Chinese online platform known for joke sharing. The resulting corpus comprises more than 160,000 entries. Leveraging CFunSet, we developed the Chinese Fun Model (CFunModel), the first large language model designed to handle various Chinese humor-related tasks, including Crosstalk Response Selection, Humor Recognition, Joke Generation, etc. Experimental results demonstrate that CFunModel outperforms popular large language models in these tasks. Our CFunSet is available at https://huggingface.co/datasets/ZhenghanYU/CFunSet and CFunModel is available at https://huggingface.co/ZhenghanYU/CFunModel. A demonstration video of our work is available at https://youtu.be/MOsISOJ66Ms.

 
